110 research outputs found

    A Generalization of the Convex Kakeya Problem

    Given a set of line segments in the plane, not necessarily finite, what is a convex region of smallest area that contains a translate of each input segment? This question can be seen as a generalization of Kakeya's problem of finding a convex region of smallest area such that a needle can be rotated through 360 degrees within this region. We show that there is always an optimal region that is a triangle, and we give an optimal \Theta(n log n)-time algorithm to compute such a triangle for a given set of n segments. We also show that, if the goal is to minimize the perimeter of the region instead of its area, then placing the segments with their midpoint at the origin and taking their convex hull results in an optimal solution. Finally, we show that for any compact convex figure G, the smallest enclosing disk of G is a smallest-perimeter region containing a translate of every rotated copy of G. Comment: 14 pages, 9 figures
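
    The perimeter result in this abstract (center every segment at the origin, then take the convex hull) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `centered_endpoints`, `convex_hull`, and `perimeter` are hypothetical, the input is assumed to be a finite list of segments given as endpoint pairs, and the hull routine is a standard monotone-chain implementation chosen for brevity.

```python
from math import hypot

def centered_endpoints(segments):
    """Translate each segment so its midpoint is at the origin; return all endpoints."""
    pts = []
    for (x1, y1), (x2, y2) in segments:
        hx, hy = (x2 - x1) / 2.0, (y2 - y1) / 2.0  # half-vector of the segment
        pts.append((hx, hy))
        pts.append((-hx, -hy))
    return pts

def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def perimeter(hull):
    """Perimeter of the closed polygon; a 2-point hull counts the segment twice."""
    n = len(hull)
    return sum(
        hypot(hull[(i + 1) % n][0] - hull[i][0], hull[(i + 1) % n][1] - hull[i][1])
        for i in range(n)
    )

# Two unit-half-length segments, one horizontal and one vertical, anywhere in the plane.
segments = [((0, 0), (2, 0)), ((5, 5), (5, 7))]
hull = convex_hull(centered_endpoints(segments))
print(hull, perimeter(hull))
```

    Sorting dominates the running time, so this sketch runs in O(n log n) for n segments. It covers only the perimeter variant; the minimum-area triangle algorithm from the abstract is not reproduced here.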

    Minimum message length inference of secondary structure from protein coordinate data

    Motivation: Secondary structure underpins the folding pattern and architecture of most proteins. Accurate assignment of the secondary structure elements is therefore an important problem. Although many approximate solutions of the secondary structure assignment problem exist, the statement of the problem has resisted a consistent and mathematically rigorous definition. A variety of comparative studies have highlighted major disagreements in the way the available methods define and assign secondary structure to coordinate data.

    Towards Reliable Automatic Protein Structure Alignment

    A variety of methods have been proposed for structure similarity calculation, which are called structure alignment or superposition. One major shortcoming in current structure alignment algorithms is in their inherent design, which is based on local structure similarity. In this work, we propose a method to incorporate global information in obtaining optimal alignments and superpositions. Our method, when applied to optimizing the TM-score and the GDT score, produces significantly better results than current state-of-the-art protein structure alignment tools. Specifically, if the highest TM-score found by TMalign is lower than 0.6 and the highest TM-score found by one of the tested methods is higher than 0.5, there is a probability of 42% that TMalign failed to find TM-scores higher than 0.5, while the same probability is reduced to 2% if our method is used. This could significantly improve the accuracy of fold detection if the cutoff TM-score of 0.5 is used. In addition, existing structure alignment algorithms focus on structure similarity alone and simply ignore other important similarities, such as sequence similarity. Our approach has the capacity to incorporate multiple similarities into the scoring function. Results show that sequence similarity aids in finding high quality protein structure alignments that are more consistent with eye-examined alignments in HOMSTRAD. Even when structure similarity itself fails to find alignments with any consistency with eye-examined alignments, our method remains capable of finding alignments highly similar to, or even identical to, eye-examined alignments. Comment: Peer-reviewed and presented as part of the 13th Workshop on Algorithms in Bioinformatics (WABI2013)
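
    For readers unfamiliar with the TM-score thresholds quoted in this abstract, the following Python sketch evaluates the commonly cited TM-score formula for one fixed alignment and superposition. It is an illustration only, not the method proposed in the paper: the function name `tm_score` is hypothetical, the per-residue distances are assumed to be given in ångströms, and flooring d0 at 0.5 for short chains is an assumption made here to keep the formula defined. The real score is the maximum of this quantity over all superpositions, which is the optimization problem the abstract addresses.

```python
def tm_score(distances, target_length):
    """TM-score of a fixed superposition.

    distances: distances (in angstroms) between aligned residue pairs.
    target_length: number of residues in the target (normalizing) structure.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    # Length-dependent distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8.
    # Assumption: floor d0 at 0.5 so that short chains do not produce a
    # non-positive or undefined scale.
    if target_length > 15:
        d0 = max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)
    else:
        d0 = 0.5
    # Each aligned pair contributes a value in (0, 1]; unaligned residues contribute 0.
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

# Half of a 100-residue target aligned perfectly, the rest unaligned.
print(tm_score([0.0] * 50, 100))
```

    Because the sum is divided by the target length rather than the number of aligned pairs, a short but precise partial alignment cannot reach a high score, which is why a fixed cutoff such as 0.5 is usable for fold detection.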

    VIPERdb2: an enhanced and web API enabled relational database for structural virology

    VIPERdb (http://viperdb.scripps.edu) is a relational database and a web portal for icosahedral virus capsid structures. Our aim is to provide a comprehensive resource specific to the needs of the virology community, with an emphasis on the description and comparison of derived data from structural and computational analyses of the virus capsids. In the current release, VIPERdb2, we implemented a useful and novel method to represent capsid protein residues in the icosahedral asymmetric unit (IAU) using azimuthal polar orthographic projections, otherwise known as Φ–Ψ (Phi–Psi) diagrams. In conjunction with a new Application Programming Interface (API), these diagrams can be used as a dynamic interface to the database to map residues (categorized as surface, interface and core residues) and identify family-wide conserved residues, including hotspots at the interfaces. Additionally, we enhanced the interactivity with the database by interfacing with web-based tools. In particular, the applications Jmol and STRAP were implemented to visualize and interact with the virus molecular structures and provide sequence–structure alignment capabilities. Together with extended curation practices that maintain data uniformity, a relational database implementation based on a schema for macromolecular structures and the APIs provided will greatly enhance the ability to do structural bioinformatics analysis of virus capsids.

    Tableau-based protein substructure search using quadratic programming

    Background: Searching for proteins that contain similar substructures is an important task in structural biology. The exact solution of most formulations of this problem, including a recently published method based on tableaux, is too slow for practical use in scanning a large database. Results: We developed an improved method for detecting substructural similarities in proteins using tableaux. Tableaux are compared efficiently by solving the quadratic program (QP) corresponding to the quadratic integer program (QIP) formulation of the extraction of maximally-similar tableaux. We compare the accuracy of the method in classifying protein folds with some existing techniques. Conclusion: We find that including constraints based on the separation of secondary structure elements increases the accuracy of protein structure search using maximally-similar subtableau extraction, to a level where it has comparable or superior accuracy to existing techniques. We demonstrate that our implementation is able to search a structural database in a matter of hours on a standard PC.

    Fast and accurate protein substructure searching with simulated annealing and GPUs

    Background: Searching a database of protein structures for matches to a query structure, or occurrences of a structural motif, is an important task in structural biology and bioinformatics. While there are many existing methods for structural similarity searching, faster and more accurate approaches are still required, and few current methods are capable of substructure (motif) searching. Results: We developed an improved heuristic for tableau-based protein structure and substructure searching using simulated annealing that is as fast as or faster than, and comparable in accuracy to, some widely used existing methods. Furthermore, we created a parallel implementation on a modern graphics processing unit (GPU). Conclusions: The GPU implementation achieves up to 34 times speedup over the CPU implementation of tableau-based structure search with simulated annealing, making it one of the fastest available methods. To the best of our knowledge, this is the first application of a GPU to the protein structural search problem.

    Solution Structure and Phylogenetics of Prod1, a Member of the Three-Finger Protein Superfamily Implicated in Salamander Limb Regeneration

    Prod1 is a cell-surface molecule of the three-finger protein (TFP) superfamily involved in the specification of newt limb PD identity. The TFP superfamily is a highly diverse group of metazoan proteins that includes snake venom toxins, mammalian transmembrane receptors and miscellaneous signaling molecules... The available data suggest that Prod1, and thereby its role in encoding PD identity, is restricted to salamanders. The lack of comparable limb-regenerative capability in other adult vertebrates could be correlated with the absence of the Prod1 gene.

    A Mathematical Framework for Protein Structure Comparison

    Comparison of protein structures is important for revealing the evolutionary relationship among proteins, predicting protein functions and predicting protein structures. Many methods have been developed in the past to align two or multiple protein structures. Despite the importance of this problem, rigorous mathematical or statistical frameworks have seldom been pursued for general protein structure comparison. One notable issue in this field is that, of the many different distances used to measure the similarity between protein structures, none is a proper distance when protein structures of different sequences are compared. Statistical approaches that treat those non-proper distances or similarity scores as random variables are thus not mathematically rigorous. In this work, we develop a mathematical framework for protein structure comparison by treating protein structures as three-dimensional curves. Using an elastic Riemannian metric on spaces of curves, the geodesic distance, a proper distance on spaces of curves, can be computed for any two protein structures. In this framework, protein structures can be treated as random variables on the shape manifold, and means and covariances can be computed for populations of protein structures. Furthermore, these moments can be used to build Gaussian-type probability distributions of protein structures for use in hypothesis testing. The covariance of a population of protein structures can reveal the population-specific variations and be helpful in improving structure classification. With curves representing protein structures, the matching is performed using elastic shape analysis of curves, which can effectively model conformational changes and insertions/deletions. We show that our method performs comparably with commonly used methods in protein structure classification on a large manually annotated data set.